The Application of Preconditioned Alternating Direction Method of Multipliers in Depth from Focal Stack
Post-capture refocusing in smartphone cameras is achievable by using
focal stacks. However, the accuracy of this effect depends entirely on the
combination of the depth layers in the stack. The accuracy of the extended
depth of field effect in this application can be improved significantly by
computing an accurate depth map, which has been an open issue for decades. To
tackle this issue, this paper proposes a framework based on the
Preconditioned Alternating Direction Method of Multipliers (PADMM) for depth
from focal stack and synthetic defocus applications. In addition to
providing high structural accuracy and occlusion handling, the
optimization function of the proposed method can converge faster and
better than state-of-the-art methods. The evaluation was carried out on 21 sets
of focal stacks, and the optimization function was compared against 5 other
methods. Preliminary results indicate that the proposed method performs
better in terms of structural accuracy and optimization than
the current state-of-the-art methods.
Comment: 15 pages, 8 figures
High-Accuracy Facial Depth Models derived from 3D Synthetic Data
In this paper, we explore how synthetically generated 3D face models can be
used to construct a high accuracy ground truth for depth. This allows us to
train Convolutional Neural Networks (CNNs) to solve facial depth estimation
problems. These models provide sophisticated controls over image variations
including pose, illumination, facial expressions and camera position. 2D
training samples can be rendered from these models, typically in RGB format,
together with depth information. Using synthetic facial animations, dynamic
facial expression or facial action data can be rendered for a sequence of image
frames together with ground truth depth and additional metadata such as head
pose, light direction, etc. The synthetic data is used to train a CNN-based
facial depth estimation system which is validated on both synthetic and real
images. Potential fields of application include 3D reconstruction, driver
monitoring systems, robotic vision systems, and advanced scene understanding.
Towards Synthetic Generation of Clinical Rosacea Images with GAN Models
Computer-aided skin disease diagnosis has recently attracted much attention in the scientific and medical research community due to advances in computer vision and machine learning algorithms. These methodologies essentially rely on large datasets collected from hospitals and medical professionals. Data scarcity is a critical problem in the medical domain, especially for facial skin conditions, due to privacy concerns. For instance, some facial skin conditions, e.g. Rosacea, require observation of the entire face, which reveals the patient's identity. Rosacea is a lamentably neglected skin condition in the computer-aided diagnosis research community, due to the limited availability of Rosacea datasets. Hence, there is a need for exploring alternative ways to deal with the limited available data for Rosacea. A common approach to expanding small datasets is to utilise augmentation techniques. One of the most powerful augmentation methods in machine learning is the Generative Adversarial Network (GAN). Recently, GANs, principally the variants of StyleGAN, have successfully generated synthetic facial images. In this paper, a small dataset of 300 images of a particular skin disease, Rosacea, is used to examine the potential of a variant of StyleGAN known as StyleGAN2-ADA. The preliminary experiments and evaluations show promising signs towards addressing the data scarcity for computer-aided Rosacea diagnosis.
Skin disease analysis with limited data in particular Rosacea: a review and recommended framework
Recently, the rapid advancements in Deep Learning and Computer Vision technologies have introduced a new and exciting era in the field of skin disease analysis. However, there are certain challenges in the roadmap towards developing such technologies for real-life applications that must be investigated. This study considers one of the key challenges in data acquisition and computation, viz. data scarcity. Data scarcity is a central problem in acquiring medical images and applying machine learning techniques to train Convolutional Neural Networks for disease diagnosis. The main objective of this study is to explore the possible methods to deal with the data scarcity problem and to improve diagnosis with small datasets. The challenges in data acquisition for a few lamentably neglected skin conditions such as rosacea are an excellent instance to explore the possibilities of improving computer-aided skin disease diagnosis. With data scarcity in mind, the possible techniques explored and discussed include Generative Adversarial Networks, Meta-Learning, Few-Shot classification, and 3D face modelling. Furthermore, the existing studies are discussed based on the skin conditions considered, data volume and implementation choices. Some future research directions are recommended.
Identifying Candidate Spaces for Advert Implantation
Virtual advertising is an important and promising feature in the area of
online advertising. It involves integrating adverts into live or recorded
videos for product placements and targeted advertisements. Such integration of
adverts is primarily done by video editors in the post-production stage, which
is cumbersome and time-consuming. Therefore, it is important to automatically
identify candidate spaces in a video frame, wherein new adverts can be
implanted. The candidate space should match the scene perspective, and also
have a high quality of experience according to human subjective judgment. In
this paper, we propose the use of a bespoke neural net that can assist the
video editors in identifying candidate spaces. We benchmark our approach
against several deep-learning architectures on a large-scale image dataset of
candidate spaces of outdoor scenes. Our work is the first of its kind in this
area of multimedia and augmented reality applications, and achieves the best
results.
Comment: Published in Proc. IEEE 7th International Conference on Computer
Science and Network Technology, 201